Word Sense Disambiguation Using Sense Examples Automatically Acquired from a Second Language
نویسندگان
چکیده
We present a novel almost-unsupervised approach to the task of Word Sense Disambiguation (WSD). We build sense examples automatically, using large quantities of Chinese text, and English-Chinese and Chinese-English bilingual dictionaries, taking advantage of the observation that mappings between words and meanings are often different in typologically distant languages. We train a classifier on the sense examples and test it on a gold standard English WSD dataset. The evaluation gives results that exceed previous state-of-the-art results for comparable systems. We also demonstrate that a little manual effort can improve the quality of sense examples, as measured by WSD accuracy. The performance of the classifier on WSD also improves as the number of training sense examples increases.
منابع مشابه
DALE: A Word Sense Disambiguation System for Biomedical Documents Trained using Automatically Labeled Examples
Automatic interpretation of documents is hampered by the fact that language contains terms which have multiple meanings. These ambiguities can still be found when language is restricted to a particular domain, such as biomedicine. Word Sense Disambiguation (WSD) systems attempt to resolve these ambiguities but are often only able to identify the meanings for a small set of ambiguous terms. DALE...
متن کاملWord Sense Disambiguation Using Semi-Supervised Naive Bayes with Ontological Constraints
Background. Word sense disambiguation (WSD) is the task of mapping an ambiguous word to its correct sense given its context. As high-quality sensetagged data is scarce and expensive to obtain, attention has shifted from fullysupervised to semi-supervised and knowledge-based approaches to WSD that rely on a lexical knowledge base such as WordNet instead of large amounts of hand-labeled data. Wha...
متن کاملAnalogical Word Sense Disambiguation
Word sense disambiguation is an important problem in learning by reading. This paper introduces analogical word-sense disambiguation, which uses human-like analogical processing over structured, relational representations to perform word sense disambiguation. Cases are automatically constructed using representations produced via natural language analysis of sentences, and include both conceptua...
متن کاملCombining Machine Readable Lexical Resources and Bilingual Corpora for Broad Word Sense Disambiguation
This paper describes a new approach to word sense disambiguation (WSD) based on automatically acquired "word sense division. The semantically related sense entries in a bilingual dictionary are arranged in clusters using a heuristic labeling algorithm to provide a more complete and appropriate sense division for WSD. Multiple translations of senses serve as outside information for automatic tag...
متن کاملEvaluating large-scale Knowledge Resources across Languages
This paper presents an empirical evaluation in a multilingual scenario of the semantic knowledge present on publicly available large-scale knowledge resources. The study covers a wide range of manually and automatically derived large-scale knowledge resources for English and Spanish. In order to establish a fair and neutral comparison, the knowledge resources are evaluated using the same method...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005